{
 "cells": [
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# fewshot提示模板"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [],
   "source": [
    "from langchain.prompts.few_shot import FewShotPromptTemplate\n",
    "from langchain.prompts.prompt import PromptTemplate"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [],
   "source": [
    "examples = [\n",
    "    {\"input\": \"高\", \"output\": \"矮\"},\n",
    "    {\"input\": \"胖\", \"output\": \"瘦\"},\n",
    "    {\"input\": \"精力充沛\", \"output\": \"萎靡不振\"},\n",
    "    {\"input\": \"快乐\", \"output\": \"伤心\"},\n",
    "    {\"input\": \"黑\", \"output\": \"白\"},\n",
    "]\n"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "我们像构造提示词模板对象一样，构造一个普通的 PromptTemplate 对象。"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [],
   "source": [
    "example_prompt=PromptTemplate(input_variables=[\"input\",\"output\"],\n",
    "    template=\"\"\"\n",
    "词语:  {input}\\n\n",
    "反义词:  {output}\\n\n",
    "\"\"\")\n",
    "\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "'\\n词语:  高\\n\\n反义词:  矮\\n\\n'"
      ]
     },
     "execution_count": 4,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "example_prompt.format(**examples[0])"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "调用 format 方法，填入 input 和 output 参数。 当你写`example_prompt.format(**examples[0])`时，`**examples[0]`会将第一个字典的键值对解开，然后作为关键字参数传递给format方法。这等价于example_prompt.format(input=\"高\", output=\"矮\")。"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "'我们来玩个反义词游戏，我说词语，你说它的反义词\\n\\n\\n词语:  高\\n\\n反义词:  矮\\n\\n\\n\\n词语:  胖\\n\\n反义词:  瘦\\n\\n\\n\\n词语:  精力充沛\\n\\n反义词:  萎靡不振\\n\\n\\n\\n词语:  快乐\\n\\n反义词:  伤心\\n\\n\\n\\n词语:  黑\\n\\n反义词:  白\\n\\n\\n现在轮到你了，词语: 好\\n反义词: '"
      ]
     },
     "execution_count": 5,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "few_shot_prompt = FewShotPromptTemplate(\n",
    "  examples=examples,\n",
    "  example_prompt=example_prompt,\n",
    "  example_separator=\"\\n\",\n",
    "  prefix=\"我们来玩个反义词游戏，我说词语，你说它的反义词\\n\",\n",
    "  suffix=\"现在轮到你了，词语: {input}\\n反义词: \",\n",
    "  input_variables=[\"input\"],\n",
    ")\n",
    "few_shot_prompt.format(input=\"好\")\n"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "我们开始实例化一个 FewShotPromptTemplate 对象，引入 FewShotPromptTemplate 类后，直接使用传参函数式调用。值得注意的是 example单词开始的几个参数。\n",
    "- 参数 `examples` 是一个字典列表，其中每个字典包含两个键值对：{\"input\": \"高\", \"output\": \"矮\"}。\n",
    "- 参数 `example_prompt` 是一个PromptTemplate 对象，不是一个提示词字符串，而是 PromptTemplate 类实例化的对象。\n",
    "- 参数 `example_separator` 是例子之间使用什么分割符号： \\n 代表例子与例子之间是一个空行。\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [
    {
     "ename": "KeyError",
     "evalue": "'suffix'",
     "output_type": "error",
     "traceback": [
      "\u001b[1;31m---------------------------------------------------------------------------\u001b[0m",
      "\u001b[1;31mKeyError\u001b[0m                                  Traceback (most recent call last)",
      "Cell \u001b[1;32mIn[6], line 1\u001b[0m\n\u001b[1;32m----> 1\u001b[0m no_suffix_few_shot_prompt \u001b[38;5;241m=\u001b[39m \u001b[43mFewShotPromptTemplate\u001b[49m\u001b[43m(\u001b[49m\n\u001b[0;32m      2\u001b[0m \u001b[43m  \u001b[49m\u001b[43mexamples\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43mexamples\u001b[49m\u001b[43m,\u001b[49m\n\u001b[0;32m      3\u001b[0m \u001b[43m  \u001b[49m\u001b[43mexample_prompt\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43mexample_prompt\u001b[49m\u001b[43m,\u001b[49m\n\u001b[0;32m      4\u001b[0m \u001b[43m  \u001b[49m\u001b[43mexample_separator\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[38;5;124;43m\"\u001b[39;49m\u001b[38;5;130;43;01m\\n\u001b[39;49;00m\u001b[38;5;124;43m\"\u001b[39;49m\u001b[43m,\u001b[49m\n\u001b[0;32m      5\u001b[0m \u001b[43m  \u001b[49m\u001b[43mprefix\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[38;5;124;43m\"\u001b[39;49m\u001b[38;5;124;43m我们来玩个反义词游戏，我说词语，你说它的反义词\u001b[39;49m\u001b[38;5;130;43;01m\\n\u001b[39;49;00m\u001b[38;5;124;43m\"\u001b[39;49m\u001b[43m,\u001b[49m\n\u001b[0;32m      6\u001b[0m \u001b[43m  \u001b[49m\u001b[43minput_variables\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43m[\u001b[49m\u001b[38;5;124;43m\"\u001b[39;49m\u001b[38;5;124;43minput\u001b[39;49m\u001b[38;5;124;43m\"\u001b[39;49m\u001b[43m]\u001b[49m\u001b[43m,\u001b[49m\n\u001b[0;32m      7\u001b[0m \u001b[43m)\u001b[49m\n\u001b[0;32m      8\u001b[0m no_suffix_few_shot_prompt\u001b[38;5;241m.\u001b[39mformat(\u001b[38;5;28minput\u001b[39m\u001b[38;5;241m=\u001b[39m\u001b[38;5;124m\"\u001b[39m\u001b[38;5;124m好\u001b[39m\u001b[38;5;124m\"\u001b[39m)\n",
      "File \u001b[1;32m~\\AppData\\Local\\Packages\\PythonSoftwareFoundation.Python.3.11_qbz5n2kfra8p0\\LocalCache\\local-packages\\Python311\\site-packages\\langchain_core\\load\\serializable.py:97\u001b[0m, in \u001b[0;36mSerializable.__init__\u001b[1;34m(self, **kwargs)\u001b[0m\n\u001b[0;32m     96\u001b[0m \u001b[38;5;28;01mdef\u001b[39;00m \u001b[38;5;21m__init__\u001b[39m(\u001b[38;5;28mself\u001b[39m, \u001b[38;5;241m*\u001b[39m\u001b[38;5;241m*\u001b[39mkwargs: Any) \u001b[38;5;241m-\u001b[39m\u001b[38;5;241m>\u001b[39m \u001b[38;5;28;01mNone\u001b[39;00m:\n\u001b[1;32m---> 97\u001b[0m     \u001b[38;5;28;43msuper\u001b[39;49m\u001b[43m(\u001b[49m\u001b[43m)\u001b[49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[38;5;21;43m__init__\u001b[39;49m\u001b[43m(\u001b[49m\u001b[38;5;241;43m*\u001b[39;49m\u001b[38;5;241;43m*\u001b[39;49m\u001b[43mkwargs\u001b[49m\u001b[43m)\u001b[49m\n\u001b[0;32m     98\u001b[0m     \u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39m_lc_kwargs \u001b[38;5;241m=\u001b[39m kwargs\n",
      "File \u001b[1;32m~\\AppData\\Local\\Packages\\PythonSoftwareFoundation.Python.3.11_qbz5n2kfra8p0\\LocalCache\\local-packages\\Python311\\site-packages\\pydantic\\main.py:339\u001b[0m, in \u001b[0;36mpydantic.main.BaseModel.__init__\u001b[1;34m()\u001b[0m\n",
      "File \u001b[1;32m~\\AppData\\Local\\Packages\\PythonSoftwareFoundation.Python.3.11_qbz5n2kfra8p0\\LocalCache\\local-packages\\Python311\\site-packages\\pydantic\\main.py:1102\u001b[0m, in \u001b[0;36mpydantic.main.validate_model\u001b[1;34m()\u001b[0m\n",
      "File \u001b[1;32m~\\AppData\\Local\\Packages\\PythonSoftwareFoundation.Python.3.11_qbz5n2kfra8p0\\LocalCache\\local-packages\\Python311\\site-packages\\langchain_core\\prompts\\few_shot.py:117\u001b[0m, in \u001b[0;36mFewShotPromptTemplate.template_is_valid\u001b[1;34m(cls, values)\u001b[0m\n\u001b[0;32m    108\u001b[0m     check_valid_template(\n\u001b[0;32m    109\u001b[0m         values[\u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mprefix\u001b[39m\u001b[38;5;124m\"\u001b[39m] \u001b[38;5;241m+\u001b[39m values[\u001b[38;5;124m\"\u001b[39m\u001b[38;5;124msuffix\u001b[39m\u001b[38;5;124m\"\u001b[39m],\n\u001b[0;32m    110\u001b[0m         values[\u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mtemplate_format\u001b[39m\u001b[38;5;124m\"\u001b[39m],\n\u001b[0;32m    111\u001b[0m         values[\u001b[38;5;124m\"\u001b[39m\u001b[38;5;124minput_variables\u001b[39m\u001b[38;5;124m\"\u001b[39m] \u001b[38;5;241m+\u001b[39m \u001b[38;5;28mlist\u001b[39m(values[\u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mpartial_variables\u001b[39m\u001b[38;5;124m\"\u001b[39m]),\n\u001b[0;32m    112\u001b[0m     )\n\u001b[0;32m    113\u001b[0m \u001b[38;5;28;01melif\u001b[39;00m values\u001b[38;5;241m.\u001b[39mget(\u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mtemplate_format\u001b[39m\u001b[38;5;124m\"\u001b[39m):\n\u001b[0;32m    114\u001b[0m     values[\u001b[38;5;124m\"\u001b[39m\u001b[38;5;124minput_variables\u001b[39m\u001b[38;5;124m\"\u001b[39m] \u001b[38;5;241m=\u001b[39m [\n\u001b[0;32m    115\u001b[0m         var\n\u001b[0;32m    116\u001b[0m         \u001b[38;5;28;01mfor\u001b[39;00m var \u001b[38;5;129;01min\u001b[39;00m get_template_variables(\n\u001b[1;32m--> 117\u001b[0m             values[\u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mprefix\u001b[39m\u001b[38;5;124m\"\u001b[39m] \u001b[38;5;241m+\u001b[39m \u001b[43mvalues\u001b[49m\u001b[43m[\u001b[49m\u001b[38;5;124;43m\"\u001b[39;49m\u001b[38;5;124;43msuffix\u001b[39;49m\u001b[38;5;124;43m\"\u001b[39;49m\u001b[43m]\u001b[49m, values[\u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mtemplate_format\u001b[39m\u001b[38;5;124m\"\u001b[39m]\n\u001b[0;32m    118\u001b[0m         )\n\u001b[0;32m    119\u001b[0m         \u001b[38;5;28;01mif\u001b[39;00m var \u001b[38;5;129;01mnot\u001b[39;00m \u001b[38;5;129;01min\u001b[39;00m values[\u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mpartial_variables\u001b[39m\u001b[38;5;124m\"\u001b[39m]\n\u001b[0;32m    120\u001b[0m     ]\n\u001b[0;32m    121\u001b[0m \u001b[38;5;28;01mreturn\u001b[39;00m values\n",
      "\u001b[1;31mKeyError\u001b[0m: 'suffix'"
     ]
    }
   ],
   "source": [
    "no_suffix_few_shot_prompt = FewShotPromptTemplate(\n",
    "  examples=examples,\n",
    "  example_prompt=example_prompt,\n",
    "  example_separator=\"\\n\",\n",
    "  prefix=\"我们来玩个反义词游戏，我说词语，你说它的反义词\\n\",\n",
    "  input_variables=[\"input\"],\n",
    ")\n",
    "no_suffix_few_shot_prompt.format(input=\"好\")\n"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "上面的代码我去掉了： 后缀 suffix=\"词语: {input}\\n反义词: \" ， 出现报错，说明这个参数是不能省略的。  因为input_variables=[\"input\"], 指定的输入是input变量。只有在suffix 参数中显示了这个变量存放的位置，以及告知 FewShotPromptTemplate 类，实例化的时候，需要填充 input变量 到模板里面。"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 在Chain中使用"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {},
   "outputs": [],
   "source": [
    "from langchain.llms import OpenAI\n",
    "from langchain.chains import LLMChain"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {},
   "outputs": [],
   "source": [
    "chain = LLMChain(llm=OpenAI(openai_api_key=\"填入您的OpenAI开发密钥\"),\n",
    " prompt=few_shot_prompt )"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "' 热'"
      ]
     },
     "execution_count": 9,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "chain.run(\"冷\")"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "在这段代码中，我们首先实例化了一个LLMChain对象。这个对象是LangChain库中的一个核心组件，可以理解为一个执行链，它将各个步骤连接在一起，形成一个完整的运行流程。LLMChain对象在实例化时需要两个关键参数：一个是llm，这里我们使用了OpenAI提供的大型语言模型；另一个是prompt，这里我们传入的是我们刚刚创建的few_shot_prompt对象。"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "##  基于长度选择的实例选择器：LengthBasedExampleSelector"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {},
   "outputs": [],
   "source": [
    "from langchain.prompts.example_selector import LengthBasedExampleSelector"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "LengthBasedExampleSelector(examples=[{'input': '高', 'output': '矮'}, {'input': '胖', 'output': '瘦'}, {'input': '精力充沛', 'output': '萎靡不振'}, {'input': '快乐', 'output': '伤心'}, {'input': '黑', 'output': '白'}], example_prompt=PromptTemplate(input_variables=['input', 'output'], template='\\n词语:  {input}\\n\\n反义词:  {output}\\n\\n'), get_text_length=<function _get_length_based at 0x000002D7F3935120>, max_length=25, example_text_lengths=[10, 10, 10, 10, 10])"
      ]
     },
     "execution_count": 11,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "example_selector = LengthBasedExampleSelector(\n",
    "  examples=examples, \n",
    "  example_prompt=example_prompt, \n",
    "  max_length=25,\n",
    ")\n",
    "example_selector"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "'我们来玩个反义词游戏，我说词语，你说它的反义词\\n\\n\\n词语:  高\\n\\n反义词:  矮\\n\\n\\n\\n词语:  胖\\n\\n反义词:  瘦\\n\\n\\n现在轮到你了，词语: 好\\n反义词: '"
      ]
     },
     "execution_count": 13,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "example_selector_prompt = FewShotPromptTemplate(\n",
    "  example_selector=example_selector,\n",
    "  example_prompt=example_prompt,\n",
    "  example_separator=\"\\n\",\n",
    "  prefix=\"我们来玩个反义词游戏，我说词语，你说它的反义词\\n\",\n",
    "  suffix=\"现在轮到你了，词语: {input}\\n反义词: \",\n",
    "  input_variables=[\"input\"],\n",
    ")\n",
    "example_selector_prompt.format(input=\"好\")"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "我们发现：examples列表中这些键值对并没有出现在提示词中：\n",
    "{\"input\": \"精力充沛\", \"output\": \"萎靡不振\"},\n",
    "{\"input\": \"快乐\", \"output\": \"伤心\"},\n",
    "{\"input\": \"黑\", \"output\": \"白\"},"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "因为我们实例化这个少样本实例选择器的时候，选择的基于长度选择示例的 LengthBasedExampleSelector，并且实例化为一个对象，传参给了 FewShotPromptTemplate 的参数 example_selector， 并且设置了：max_length=25。如果示例的长度超过我们设置的最大长度，则会截断。\n",
    "\n",
    "我们可以改一下最大长度的参数为100：max_length=100 。"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "LengthBasedExampleSelector(examples=[{'input': '高', 'output': '矮'}, {'input': '胖', 'output': '瘦'}, {'input': '精力充沛', 'output': '萎靡不振'}, {'input': '快乐', 'output': '伤心'}, {'input': '黑', 'output': '白'}], example_prompt=PromptTemplate(input_variables=['input', 'output'], template='\\n词语:  {input}\\n\\n反义词:  {output}\\n\\n'), get_text_length=<function _get_length_based at 0x000002D7F3935120>, max_length=100, example_text_lengths=[10, 10, 10, 10, 10])"
      ]
     },
     "execution_count": 14,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "example_selector = LengthBasedExampleSelector(\n",
    "  examples=examples, \n",
    "  example_prompt=example_prompt, \n",
    "  max_length=100, # \n",
    ")\n",
    "example_selector"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "所有的示例都被展示了。"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "[{'input': '高', 'output': '矮'}, {'input': '胖', 'output': '瘦'}, {'input': '精力充沛', 'output': '萎靡不振'}, {'input': '快乐', 'output': '伤心'}, {'input': '黑', 'output': '白'}]"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 其他示例选择器"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "在这些示例选择器中，实例化参数的确有所不同。它们都需要传入基础的参数，如examples和example_prompt，但根据选择器的不同，还有一些额外的参数需要设置。\n",
    "\n",
    "对于LengthBasedExampleSelector，除了examples和example_prompt外，还需要传递max_length参数来设置示例的最大长度。\n",
    "\n",
    "example_selector = LengthBasedExampleSelector(\n",
    "    examples=examples, \n",
    "    example_prompt=example_prompt, \n",
    "    max_length=25,\n",
    ")\n",
    "\n",
    "对于MaxMarginalRelevanceExampleSelector，除了传入示例（examples）外，还需要传入一个用于生成语义相似性测量的嵌入类（OpenAIEmbeddings()），一个用于存储嵌入和执行相似性搜索的VectorStore类（FAISS），以及需要生成的示例数量（k=2）。\n",
    "\n",
    "example_selector = MaxMarginalRelevanceExampleSelector.from_examples(\n",
    "    examples,\n",
    "    OpenAIEmbeddings(),\n",
    "    FAISS,\n",
    "    k=2,\n",
    ")\n",
    "\n",
    "对于NGramOverlapExampleSelector，除了examples和example_prompt外，还有一个threshold参数用于设定选择器的停止阈值。\n",
    "\n",
    "example_selector = NGramOverlapExampleSelector(\n",
    "    examples=examples,\n",
    "    example_prompt=example_prompt,\n",
    "    threshold=-1.0,\n",
    ")\n",
    "\n",
    "对于SemanticSimilarityExampleSelector，除了传入示例（examples）外，还需要传入一个用于生成语义相似性测量的嵌入类（OpenAIEmbeddings()），一个用于存储嵌入和执行相似性搜索的VectorStore类（Chroma），以及需要生成的示例数量（k=1）。\n",
    "\n",
    "example_selector = SemanticSimilarityExampleSelector.from_examples(\n",
    "    examples, \n",
    "    OpenAIEmbeddings(), \n",
    "    Chroma, \n",
    "    k=1\n",
    ")\n",
    "\n",
    "每种选择器都有其独特的参数设置，以满足不同的示例选择需求。"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.11.7"
  },
  "orig_nbformat": 4
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
